Parsing

Core Concept

Parsing is the task of predicting a syntactic structure over a sentence—typically a tree that represents constituency (phrase structure) or dependency (head–dependent relations). The output is a rooted tree whose nodes or arcs are labeled (e.g. NP, VP, or subject, object). Valid outputs must satisfy grammatical constraints (e.g. well-formed brackets, a single head per word), so parsing is both a structured prediction problem and a constrained decoding problem. Models are trained on treebanks (sentences paired with gold trees) and evaluated by how closely predicted trees match the gold trees (attachment scores for dependency parsing, bracket F1 for constituency parsing). Parsing supports downstream tasks such as information extraction, question answering, and machine translation.
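Evaluation itself is simple once trees are represented concretely. As a minimal sketch, assuming dependency trees are stored as head arrays (the representation and names here are illustrative, not from any particular toolkit), unlabeled attachment score (UAS) is just per-word head agreement:

```python
def uas(gold_heads, pred_heads):
    """Unlabeled attachment score: the fraction of words whose
    predicted head matches the gold head."""
    assert len(gold_heads) == len(pred_heads)
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)

# "the dog barks": words are 1-indexed, 0 denotes ROOT.
gold = [2, 3, 0]  # the -> dog, dog -> barks, barks -> ROOT
pred = [3, 3, 0]  # the parser attached "the" to "barks" by mistake
print(uas(gold, pred))  # 0.666...
```

Labeled attachment score (LAS) additionally requires the relation label to match; bracket F1 for constituency compares predicted and gold labeled spans in the same spirit.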

Key Characteristics

  • Tree-shaped output – Output is a tree (constituency: nested phrases; dependency: directed arcs from heads to dependents). The space of valid trees is defined by a grammar or dependency rules, so decoding must produce only valid structures.
  • Global structure – Decisions are globally constrained (e.g. every word has exactly one head in dependency parsing; brackets must nest in constituency). Local classifiers or scorers are combined with dynamic programming or transition systems to enforce validity; see the sketch after this list.
  • Two main formalisms – Constituency parsing produces phrase-structure trees (e.g. NP, VP, S); dependency parsing produces trees of head–dependent arcs, one head per word (except the root). Conversion between the two is possible but not trivial.
  • Decoding – Constituency: CKY or chart parsing with a context-free grammar. Dependency: dynamic programming (e.g. the Eisner algorithm) for projective trees, or transition-based systems (shift-reduce, arc-standard), which handle non-projective trees via extensions such as a swap action. Neural parsers often use transition-based decoding with learned policies.
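To make the global constraints concrete: a dependency tree over n words can be encoded as a head array, and validity (exactly one root, one head per word, no cycles) can be checked directly. A minimal sketch under that encoding (illustrative only):

```python
def is_valid_tree(heads):
    """heads[i] is the head of word i+1 (words are 1-indexed, 0 = ROOT).
    The array encoding already enforces 'one head per word'; we check
    for exactly one root and that every head chain reaches ROOT
    without a cycle."""
    n = len(heads)
    if heads.count(0) != 1:
        return False
    for start in range(1, n + 1):
        seen, node = set(), start
        while node != 0:
            if node in seen or not 0 <= heads[node - 1] <= n:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

print(is_valid_tree([2, 0, 2]))  # True: word 2 heads words 1 and 3
print(is_valid_tree([2, 1, 0]))  # False: words 1 and 2 form a cycle
```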

Common Applications

  • Constituency (phrase-structure) parsing – Producing nested phrase brackets and nonterminal labels (e.g. Penn Treebank style); used in grammar checking and extraction of phrase-level semantics.
  • Dependency parsing – Producing head–dependent arcs and relation types; used in many NLP pipelines (relation extraction, semantic role labeling, MT).
  • Semantic parsing – Mapping sentences to meaning representations (e.g. logical form, AMR); the output may be a tree or a graph.
  • Code parsing – Producing ASTs (abstract syntax trees) for source code; supports refactoring, analysis, and code generation (see the sketch below).
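Python's built-in ast module shows code parsing in miniature; the tree's nesting encodes operator precedence the way phrase nesting encodes syntax. (ast.dump's indent argument requires Python 3.9+.)

```python
import ast

# Parse a statement into an abstract syntax tree.
tree = ast.parse("total = price * qty + tax")
print(ast.dump(tree, indent=2))
# The BinOp nesting (Add at the top, Mult inside) reflects operator
# precedence, much as phrase nesting reflects syntax in a treebank.
```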

Parsing Algorithms

Parsing algorithms differ in the formalism (constituency vs. dependency), the decoding strategy (dynamic programming vs. transition-based vs. chart parsing), and whether they use handcrafted grammars or learned scoring (including neural models). The choice depends on the language, the available treebank, and whether interpretability (grammar-based) or accuracy (data-driven) is the priority.

  • CKY (Cocke–Kasami–Younger) – Dynamic programming for context-free constituency parsing; fills a chart with nonterminals over spans; runs in O(n^3 · |G|) time for sentence length n and grammar size |G|. Requires a grammar in Chomsky normal form; exact decoding for the given grammar.
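A minimal CKY sketch over a toy probabilistic CNF grammar; the grammar, probabilities, and sentence are invented for illustration:

```python
from collections import defaultdict

binary = {            # (B, C) -> list of (A, prob) for rules A -> B C
    ("NP", "VP"): [("S", 1.0)],
    ("Det", "N"): [("NP", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
}
lexical = {           # word -> list of (A, prob) for rules A -> word
    "the": [("Det", 1.0)],
    "dog": [("N", 0.5)],
    "cat": [("N", 0.5)],
    "chased": [("V", 1.0)],
}

def cky(words):
    n = len(words)
    # chart[i][j][A]: best probability of A spanning words[i:j]
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    back = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                 # width-1 spans
        for A, p in lexical.get(w, []):
            chart[i][i + 1][A] = p
            back[i][i + 1][A] = w
    for width in range(2, n + 1):                 # wider spans
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):             # split point
                for B, pb in chart[i][k].items():
                    for C, pc in chart[k][j].items():
                        for A, pr in binary.get((B, C), []):
                            p = pr * pb * pc
                            if p > chart[i][j][A]:
                                chart[i][j][A] = p
                                back[i][j][A] = (k, B, C)
    return chart, back

chart, back = cky("the dog chased the cat".split())
print(chart[0][5]["S"])  # probability of the best S parse: 0.25
```

The triple loop over spans and split points is where the O(n^3) factor comes from; following the back entries recovers the best tree.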

  • Chart parsing (Earley, etc.) – General chart-based algorithms for constituency parsing with arbitrary CFGs; maintain partial parses and extend incrementally. Flexible with respect to grammar form; used in classic NLP and grammar development.
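For contrast with CKY, a compact Earley recognizer: it handles an arbitrary CFG (no CNF conversion) by interleaving predict, scan, and complete steps over a chart. A sketch with a toy grammar (all rules invented):

```python
def earley_recognize(grammar, start, words):
    """Earley recognizer. grammar: nonterminal -> list of right-hand
    sides (tuples); anything not a grammar key is a terminal.
    A state is (lhs, rhs, dot, origin)."""
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:                             # predict/complete fixpoint
            changed = False
            for (lhs, rhs, dot, orig) in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:   # predict
                    for r in grammar[rhs[dot]]:
                        st = (rhs[dot], r, 0, i)
                        if st not in chart[i]:
                            chart[i].add(st); changed = True
                elif dot == len(rhs):                        # complete
                    for (l2, r2, d2, o2) in list(chart[orig]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            st = (l2, r2, d2 + 1, o2)
                            if st not in chart[i]:
                                chart[i].add(st); changed = True
        if i < n:                                            # scan
            for (lhs, rhs, dot, orig) in chart[i]:
                if dot < len(rhs) and rhs[dot] == words[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, orig))
    return any(lhs == start and dot == len(rhs) and orig == 0
               for (lhs, rhs, dot, orig) in chart[n])

g = {"S": [("NP", "VP")], "NP": [("the", "dog"), ("the", "cat")],
     "VP": [("chased", "NP"), ("barked",)]}
print(earley_recognize(g, "S", "the dog chased the cat".split()))  # True
```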

  • Eisner algorithm – Dynamic programming for projective dependency parsing; finds the highest-scoring projective dependency tree in O(n^3). Used with arc-factored (or higher-order) scorers; exact for projective trees. (Non-projective parsing instead uses maximum spanning tree algorithms such as Chu–Liu/Edmonds.)
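A sketch of the Eisner DP over arc-factored scores. For brevity it returns only the optimal score; backpointers over the same tables recover the tree itself. Scores and dimensions are invented for illustration:

```python
import numpy as np

NEG = float("-inf")

def eisner_best_score(score):
    """Best projective tree score via Eisner's O(n^3) DP.
    score[h, m] = score of the arc h -> m; token 0 is ROOT."""
    n = score.shape[0] - 1                   # number of real words
    # direction index: 0 = head on the right, 1 = head on the left
    C = np.full((n + 1, n + 1, 2), NEG)      # complete spans
    I = np.full((n + 1, n + 1, 2), NEG)      # incomplete spans
    for i in range(n + 1):
        C[i, i, 0] = C[i, i, 1] = 0.0
    for width in range(1, n + 1):
        for i in range(n + 1 - width):
            j = i + width
            # add an arc across the best split of two complete halves
            best = max(C[i, k, 1] + C[k + 1, j, 0] for k in range(i, j))
            I[i, j, 0] = best + score[j, i]  # arc j -> i
            I[i, j, 1] = best + score[i, j]  # arc i -> j
            # extend an incomplete span into a complete one
            C[i, j, 0] = max(C[i, k, 0] + I[k, j, 0] for k in range(i, j))
            C[i, j, 1] = max(I[i, k, 1] + C[k, j, 1]
                             for k in range(i + 1, j + 1))
    return C[0, n, 1]

# Two words; impossible arcs get -inf. Best tree: ROOT -> 2, 2 -> 1.
s = np.array([[NEG, 1.0, 0.5],
              [NEG, NEG, 2.0],
              [NEG, 3.0, NEG]])
print(eisner_best_score(s))                  # 3.5
```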

  • Transition-based dependency parsing – Builds the dependency tree with a sequence of actions (e.g. shift, left-arc, right-arc) over a stack and buffer; a classifier or scorer predicts the next action. No explicit grammar; basic systems (arc-standard, arc-eager) produce only projective trees, with extensions such as a swap action handling non-projectivity. Fast and widely used (e.g. MaltParser, the Stanford neural dependency parser).
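The mechanics of the arc-standard system in miniature: in a real parser a trained classifier chooses each action, but here a hand-written action sequence makes the stack/buffer dynamics visible (illustrative only):

```python
def arc_standard(words, actions):
    """Arc-standard transitions: SHIFT moves the next buffer word onto
    the stack; LEFT-ARC / RIGHT-ARC attach the top two stack items and
    pop the dependent. Returns (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, len(words) + 1)), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":            # second-from-top <- top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif act == "RIGHT-ARC":           # second-from-top -> top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

words = ["the", "dog", "barks"]            # words 1, 2, 3; 0 = ROOT
acts = ["SHIFT", "SHIFT", "LEFT-ARC",      # dog <- the
        "SHIFT", "LEFT-ARC",               # barks <- dog
        "RIGHT-ARC"]                       # ROOT -> barks
print(arc_standard(words, acts))           # [(2, 1), (3, 2), (0, 3)]
```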

  • Neural transition-based parser – The same transition system as above, but with a neural network (BiLSTM or transformer) providing the state representation and action scores. Trained on a treebank; achieves state-of-the-art accuracy with far less feature engineering.
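A radically simplified sketch of neural action scoring in the spirit of Chen & Manning (2014): embed a couple of parser-state tokens, run an MLP, score the transitions. The weights are random and untrained, and the dimensions, feature choice, and ReLU (their model used a cube activation) are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, ACTIONS = 100, 16, 3           # SHIFT, LEFT-ARC, RIGHT-ARC
E = rng.normal(size=(VOCAB, DIM))          # word embeddings (untrained)
W1 = rng.normal(size=(2 * DIM, 32))        # hidden layer
W2 = rng.normal(size=(32, ACTIONS))        # action-score layer

def score_actions(stack_top_id, buffer_front_id):
    """Concatenate embeddings of parser-state tokens, apply an MLP,
    return one score per transition (sketch only)."""
    x = np.concatenate([E[stack_top_id], E[buffer_front_id]])
    h = np.maximum(0.0, x @ W1)            # ReLU stands in for cube
    return h @ W2

print(score_actions(5, 7))                 # argmax would pick the action
```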

  • Neural chart / constituency parser – Chart-based constituency parsing with neural scoring of spans or rules (e.g. span labeling, or rule scores in a PCFG). Decoding uses CKY or chart search; parameters are learned from a treebank. Can be combined with pre-trained encoders (BERT, etc.).
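A sketch of chart decoding over span scores, loosely in the spirit of span-based parsers such as Stern et al. (2017): the best tree over a span is its own span score plus the best split into two sub-spans. The scoring function would come from a neural encoder; here a toy dict stands in (all values invented):

```python
from functools import lru_cache

def best_tree(span_score, n):
    """Best binary bracketing over positions 0..n, maximizing the sum
    of span scores. Returns (total score, list of chosen spans)."""
    @lru_cache(maxsize=None)
    def best(i, j):
        if j - i == 1:                     # single word: no split
            return span_score(i, j), [(i, j)]
        cands = []
        for k in range(i + 1, j):          # try every split point
            ls, lt = best(i, k)
            rs, rt = best(k, j)
            cands.append((ls + rs, lt + rt))
        s, t = max(cands, key=lambda c: c[0])
        return s + span_score(i, j), [(i, j)] + t
    return best(0, n)

# Toy scores that favor the bracketing ((0,2),(2,3)).
toy = {(0, 3): 1.0, (0, 2): 2.0, (1, 3): -1.0}
score, spans = best_tree(lambda i, j: toy.get((i, j), 0.0), 3)
print(score, sorted(spans))  # 3.0 [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```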

  • Head-driven phrase structure grammar (HPSG) / LFG parsers – Parsers for rich constraint-based grammars (HPSG, Lexical-Functional Grammar); the output is a detailed syntactic–semantic structure. Used in grammar engineering and deep linguistic analysis; typically chart-based with handcrafted grammars.
